{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "#**Summary**\n",
        "\n",
        "This notebook outlines the data cleaning process that transforms my Qualtrics datasets from Malawi and Kenya into one combined dataframe for analysis and visualisation in Rstudio. The core of this notebook is a function that I have created for cleaning the raw Qualtrics data. As a general overview, in this notebook:  \n",
        "\n",
        "1.   I first discuss what my data cleaning function is doing in detail.\n",
        "2.   I then run the function on the Malawi and Kenya Qualtrics datasets.\n",
        "3.   I combine the cleaned Malawi and Kenya, reindexing as needed.\n",
        "4.   I create two additional variables (on the combined df) for later analyses related to the supplemental appendix.\n",
        "\n"
      ],
      "metadata": {
        "id": "xvpLZy5V3AiJ"
      }
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_YX7IaBXBgyC"
      },
      "source": [
        "##Libraries\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "VmMZLUf9Dcie"
      },
      "source": [
        "import os            #for file management and working directories\n",
        "import pandas as pd  #for reading and manipulating datasets\n",
        "import numpy as np   #for general data manipulating"
      ],
      "execution_count": 1,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "O3ZcEKr5DmlF",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "746fa3f2-8ba2-4a3e-c624-7e1a304f457c"
      },
      "source": [
        "#I used Google Colab, for ease of coding across devices.\n",
        "from google.colab import drive\n",
        "drive.mount('/content/drive')\n",
        "\n",
        "#Otherwise to open from your computer directory use the below code and input the folder path:\n",
        "#os.chdir(folder_path)"
      ],
      "execution_count": 2,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Mounted at /content/drive\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "#Data Cleaning Overview\n",
        "\n",
        "---\n",
        "The structure of the Qualtrics data is wide. Each row is a single respondent and for each respondent-row, there are 3 rounds (sets of columns) of conjoint responses. The firt challenges is transforming that data into a long format.\n",
        "\n",
        "A second challenge is the fact that Qualtrics only records the name of the form (between \"Form A\" vs. \"Form B\") that was clicked by the participant on the forced-choice screen. So I am missing the respondent-rows for the unselected outcome.\n",
        "\n",
        "My solution duplicates rows to create the observation for non-selection and update the duplicate row with the treatment information pertaining to the non-selected form. In the following code, I do this step by step in for all three runs of the conjoint for Kenya and for Malawi, so the code is somewhat repetitive but this allows for continual checking of information.\n",
        "\n",
        "\n"
      ],
      "metadata": {
        "id": "XjHPHKVnn9vb"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Qualtrics Dictionaries**\n",
        "\n",
        "A technical point: when images are uploaded to a Qualtrics library, they are given new names. Qualtrics maintains a table of the new names associated with each uploaded image (like a dictionary), which I stored as an excel spreadsheet. The original names of my images are **very** important because they contain a numeric code that describes the treatment prescribed in the image. I add this information to explain the reason and source of the decoding dictionaries below."
      ],
      "metadata": {
        "id": "pdstOt6PSiyZ"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#importing the Kenya dictionary (fill in the file path segment)\n",
        "qualtrics_k = pd.read_csv('/filepath/Qualtrics_Dictionary_KY.csv', names=['loc_name', 'qul_name'])\n",
        "qualtrics_dct_k = dict(zip(qualtrics_k.qul_name, qualtrics_k.loc_name))  # creation of the dictionary to translate Qualtrics names\n",
        "\n",
        "#importing the Malawi dictionary (fill in the file path segment)\n",
        "qualtrics_m = pd.read_csv('/filepath/Qualtrics_Dictionary_MW.csv', names=['loc_name', 'qul_name'])\n",
        "qualtrics_dct_m = dict(zip(qualtrics_m.qul_name, qualtrics_m.loc_name))"
      ],
      "metadata": {
        "id": "3Q8kj4BC-CrI"
      },
      "execution_count": 3,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "#Data Importation"
      ],
      "metadata": {
        "id": "Iyv5doXwJ1g7"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#read-in Qualtrics data from Kenya and Malawi surveys (fill in file path)\n",
        "df_k = pd.read_csv(\"/filepath/AP21_Kenya.csv\")\n",
        "df_m = pd.read_csv(\"/filepath/AP21_Malawi\")"
      ],
      "metadata": {
        "id": "PvyIDvBxoAWh"
      },
      "execution_count": 4,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#create id column\n",
        "df_k.insert(0, 'id', df_k.index)\n",
        "df_m.insert(0, 'id', df_m.index)"
      ],
      "metadata": {
        "id": "q3lXGekq8Su4"
      },
      "execution_count": 5,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#rename columns\n",
        "df_k.columns = ['id', 'age', 'educ', 'gender', 'ethnicty', 'reg_vote', 'vot_2017',\n",
        "       'close_to_pty', 'party', 'trust_EMB', 'chf_j_kq', 'monitor_kq',\n",
        "       ' rec_mishaps', 'direct_voters', 'counts_recs_vot', 'protect_int_party',\n",
        "       'r1_fchoice', 'r1_like_A', 'r1A_tally_chg', 'r1A_unprt_int',\n",
        "       'r1A_p_excess', 'r1A_po_prs', 'r1_like_B', 'r1B_tally_chg',\n",
        "       'r1B_unprt_int', 'r1B_p_excess', 'r1B_po_prs', 'r2_fchoice',\n",
        "       'r2_like_A', 'r2A_tally_chg', 'r2A_unprt_int', 'r2A_p_excess',\n",
        "       'r2A_po_prs', 'r2_like_B', 'r2B_tally_chg', 'r2B_unprt_int',\n",
        "       'r2B_p_excess', 'r2B_po_prs', 'r3_fchoice', 'r3_like_A',\n",
        "       'r3A_tally_chg', 'r3A_unprt_int', 'r3A_p_excess', 'r3A_po_prs',\n",
        "       'r3_like_B', 'r3B_tally_chg', 'r3B_unprt_int', 'r3B_p_excess',\n",
        "       'r3B_po_prs', 'duration ', 'i1', 'i2', 'i3', 'i4', 'i5', 'i6']\n",
        "\n",
        "df_m.columns = ['id', 'age', 'educ', 'gender', 'ethnicty', 'reg_vote', 'vot_2017',\n",
        "       'close_to_pty', 'party', 'trust_EMB', 'chf_j_kq', 'monitor_kq',\n",
        "       ' rec_mishaps', 'direct_voters', 'counts_recs_vot', 'protect_int_party',\n",
        "       'r1_fchoice', 'r1_like_A', 'r1A_tally_chg', 'r1A_unprt_int',\n",
        "       'r1A_p_excess', 'r1A_po_prs', 'r1_like_B', 'r1B_tally_chg',\n",
        "       'r1B_unprt_int', 'r1B_p_excess', 'r1B_po_prs', 'r2_fchoice',\n",
        "       'r2_like_A', 'r2A_tally_chg', 'r2A_unprt_int', 'r2A_p_excess',\n",
        "       'r2A_po_prs', 'r2_like_B', 'r2B_tally_chg', 'r2B_unprt_int',\n",
        "       'r2B_p_excess', 'r2B_po_prs', 'r3_fchoice', 'r3_like_A',\n",
        "       'r3A_tally_chg', 'r3A_unprt_int', 'r3A_p_excess', 'r3A_po_prs',\n",
        "       'r3_like_B', 'r3B_tally_chg', 'r3B_unprt_int', 'r3B_p_excess',\n",
        "       'r3B_po_prs', 'duration ', 'i1', 'i2', 'i3', 'i4', 'i5', 'i6']"
      ],
      "metadata": {
        "id": "G89TtvZz8ad-"
      },
      "execution_count": 11,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#Quality test: check for duplicates\n",
        "test = df_k.iloc[:, 0:10]\n",
        "test2 = df_m.iloc[:, 0:10]\n",
        "\n",
        "test.duplicated().value_counts()\n",
        "test2.duplicated().value_counts()"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "QqHdRBSpx-6d",
        "outputId": "0f557830-b658-47b9-b5bd-83efbe0e0c46"
      },
      "execution_count": 7,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "False    140\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 7
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Adding images names from Qualtrics dictionaries**"
      ],
      "metadata": {
        "id": "UqkoB96POg6O"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df_k['i1'].replace(qualtrics_dct_k, inplace=True)\n",
        "df_k['i2'].replace(qualtrics_dct_k, inplace=True)\n",
        "df_k['i3'].replace(qualtrics_dct_k, inplace=True)\n",
        "df_k['i4'].replace(qualtrics_dct_k, inplace=True)\n",
        "df_k['i5'].replace(qualtrics_dct_k, inplace=True)\n",
        "df_k['i6'].replace(qualtrics_dct_k, inplace=True)\n",
        "\n",
        "df_m['i1'].replace(qualtrics_dct_m, inplace=True)\n",
        "df_m['i2'].replace(qualtrics_dct_m, inplace=True)\n",
        "df_m['i3'].replace(qualtrics_dct_m, inplace=True)\n",
        "df_m['i4'].replace(qualtrics_dct_m, inplace=True)\n",
        "df_m['i5'].replace(qualtrics_dct_m, inplace=True)\n",
        "df_m['i6'].replace(qualtrics_dct_m, inplace=True)"
      ],
      "metadata": {
        "id": "LZDJQBJ1ya0g"
      },
      "execution_count": 8,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Preview of Image Renaming:**\n",
        "This shows the structure of my image labels. The first part is a country code. The second part is a label indicating whether the images was of a clean (error-free) election form or an errored one. The number attached to this string is the 'picture number' - which is where it stands among the 60+ unique images created for the experiment. The last part of the image name is the most important. This is the treatment label, with each number representing a unique treatment.\n"
      ],
      "metadata": {
        "id": "B5QNfugpc3xs"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df_k[['id', 'i1', 'i2', 'i3', 'i4', 'i5', 'i6']].head(5)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 204
        },
        "id": "mluvVBf00wzq",
        "outputId": "a2a9b4df-7bac-48fd-b6c4-b370cfcdd3e0"
      },
      "execution_count": 12,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   id                   i1                    i2                    i3  \\\n",
              "0   0  KY_Errors66_256.png   KY_Errors17_246.png      KY_Clean18_1.png   \n",
              "1   1  KY_Errors25_356.png   KY_Errors15_145.png     KY_Clean10_24.png   \n",
              "2   2  KY_Errors60_346.png    KY_Errors52_16.png    KY_Clean28_123.png   \n",
              "3   3    KY_Errors3_56.png   KY_Errors69_126.png  KY_Errors83_2456.png   \n",
              "4   4      KY_Clean5_4.png  KY_Errors33_1236.png   KY_Errors15_145.png   \n",
              "\n",
              "                     i4                   i5                    i6  \n",
              "0      KY_Clean9_23.png  KY_Errors23_156.png   KY_Errors68_125.png  \n",
              "1   KY_Errors19_456.png  KY_Errors12_346.png     KY_Clean22_12.png  \n",
              "2    KY_Errors57_45.png   KY_Clean31_234.png      KY_Clean20_3.png  \n",
              "3     KY_Clean27_34.png   KY_Clean14_134.png  KY_Errors40_3456.png  \n",
              "4  KY_Errors85_2356.png   KY_Clean15_234.png       KY_Clean5_4.png  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-c774d5d0-16af-4e0a-8ebc-f3e14f7a6cc8\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>i1</th>\n",
              "      <th>i2</th>\n",
              "      <th>i3</th>\n",
              "      <th>i4</th>\n",
              "      <th>i5</th>\n",
              "      <th>i6</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>KY_Errors66_256.png</td>\n",
              "      <td>KY_Errors17_246.png</td>\n",
              "      <td>KY_Clean18_1.png</td>\n",
              "      <td>KY_Clean9_23.png</td>\n",
              "      <td>KY_Errors23_156.png</td>\n",
              "      <td>KY_Errors68_125.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>KY_Errors25_356.png</td>\n",
              "      <td>KY_Errors15_145.png</td>\n",
              "      <td>KY_Clean10_24.png</td>\n",
              "      <td>KY_Errors19_456.png</td>\n",
              "      <td>KY_Errors12_346.png</td>\n",
              "      <td>KY_Clean22_12.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2</td>\n",
              "      <td>KY_Errors60_346.png</td>\n",
              "      <td>KY_Errors52_16.png</td>\n",
              "      <td>KY_Clean28_123.png</td>\n",
              "      <td>KY_Errors57_45.png</td>\n",
              "      <td>KY_Clean31_234.png</td>\n",
              "      <td>KY_Clean20_3.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>3</td>\n",
              "      <td>KY_Errors3_56.png</td>\n",
              "      <td>KY_Errors69_126.png</td>\n",
              "      <td>KY_Errors83_2456.png</td>\n",
              "      <td>KY_Clean27_34.png</td>\n",
              "      <td>KY_Clean14_134.png</td>\n",
              "      <td>KY_Errors40_3456.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>4</td>\n",
              "      <td>KY_Clean5_4.png</td>\n",
              "      <td>KY_Errors33_1236.png</td>\n",
              "      <td>KY_Errors15_145.png</td>\n",
              "      <td>KY_Errors85_2356.png</td>\n",
              "      <td>KY_Clean15_234.png</td>\n",
              "      <td>KY_Clean5_4.png</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c774d5d0-16af-4e0a-8ebc-f3e14f7a6cc8')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-c774d5d0-16af-4e0a-8ebc-f3e14f7a6cc8 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-c774d5d0-16af-4e0a-8ebc-f3e14f7a6cc8');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-d004701b-61c4-424b-a45f-6bc8ff5eff16\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-d004701b-61c4-424b-a45f-6bc8ff5eff16')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-d004701b-61c4-424b-a45f-6bc8ff5eff16 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 12
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "**General Preview of Dataset:**"
      ],
      "metadata": {
        "id": "hW8F-pb2fDj4"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df_k.head(2)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 262
        },
        "id": "T8Z0vAqbWBOw",
        "outputId": "d5bd388c-4798-455a-87c3-56ab2a8f5752"
      },
      "execution_count": 10,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   id  age                  educ gender ethnicty reg_vote vot_2017  \\\n",
              "0   0   24  University completed   Male      Luo      Yes      Yes   \n",
              "1   1   55  University completed   Male      Luo      Yes      Yes   \n",
              "\n",
              "  close_to_pty                             party              trust_EMB  ...  \\\n",
              "0          Yes  Orange Democratic Movement (ODM)  I somewhat trust them  ...   \n",
              "1          Yes  Orange Democratic Movement (ODM)    I do not trust them  ...   \n",
              "\n",
              "       r3B_unprt_int       r3B_p_excess         r3B_po_prs duration   \\\n",
              "0  Somewhat disagree     Strongly Agree     Strongly Agree       604   \n",
              "1  Somewhat disagree  Somewhat disagree  Strongly disagree       677   \n",
              "\n",
              "                    i1                   i2                 i3  \\\n",
              "0  KY_Errors66_256.png  KY_Errors17_246.png   KY_Clean18_1.png   \n",
              "1  KY_Errors25_356.png  KY_Errors15_145.png  KY_Clean10_24.png   \n",
              "\n",
              "                    i4                   i5                   i6  \n",
              "0     KY_Clean9_23.png  KY_Errors23_156.png  KY_Errors68_125.png  \n",
              "1  KY_Errors19_456.png  KY_Errors12_346.png    KY_Clean22_12.png  \n",
              "\n",
              "[2 rows x 56 columns]"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-c0cb8284-dd02-4346-99c1-c5bbc7d6a81c\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>age</th>\n",
              "      <th>educ</th>\n",
              "      <th>gender</th>\n",
              "      <th>ethnicty</th>\n",
              "      <th>reg_vote</th>\n",
              "      <th>vot_2017</th>\n",
              "      <th>close_to_pty</th>\n",
              "      <th>party</th>\n",
              "      <th>trust_EMB</th>\n",
              "      <th>...</th>\n",
              "      <th>r3B_unprt_int</th>\n",
              "      <th>r3B_p_excess</th>\n",
              "      <th>r3B_po_prs</th>\n",
              "      <th>duration</th>\n",
              "      <th>i1</th>\n",
              "      <th>i2</th>\n",
              "      <th>i3</th>\n",
              "      <th>i4</th>\n",
              "      <th>i5</th>\n",
              "      <th>i6</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>University completed</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Orange Democratic Movement (ODM)</td>\n",
              "      <td>I somewhat trust them</td>\n",
              "      <td>...</td>\n",
              "      <td>Somewhat disagree</td>\n",
              "      <td>Strongly Agree</td>\n",
              "      <td>Strongly Agree</td>\n",
              "      <td>604</td>\n",
              "      <td>KY_Errors66_256.png</td>\n",
              "      <td>KY_Errors17_246.png</td>\n",
              "      <td>KY_Clean18_1.png</td>\n",
              "      <td>KY_Clean9_23.png</td>\n",
              "      <td>KY_Errors23_156.png</td>\n",
              "      <td>KY_Errors68_125.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>University completed</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Orange Democratic Movement (ODM)</td>\n",
              "      <td>I do not trust them</td>\n",
              "      <td>...</td>\n",
              "      <td>Somewhat disagree</td>\n",
              "      <td>Somewhat disagree</td>\n",
              "      <td>Strongly disagree</td>\n",
              "      <td>677</td>\n",
              "      <td>KY_Errors25_356.png</td>\n",
              "      <td>KY_Errors15_145.png</td>\n",
              "      <td>KY_Clean10_24.png</td>\n",
              "      <td>KY_Errors19_456.png</td>\n",
              "      <td>KY_Errors12_346.png</td>\n",
              "      <td>KY_Clean22_12.png</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>2 rows × 56 columns</p>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c0cb8284-dd02-4346-99c1-c5bbc7d6a81c')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-c0cb8284-dd02-4346-99c1-c5bbc7d6a81c button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-c0cb8284-dd02-4346-99c1-c5bbc7d6a81c');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-3540f767-db02-4831-a028-fdf01151a855\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-3540f767-db02-4831-a028-fdf01151a855')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-3540f767-db02-4831-a028-fdf01151a855 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 10
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "###**Function Break Down (using Kenya data as an example)**\n",
        "\n",
        "Here I break down the steps of my data cleaning function (which is found later in this notebook) for clarity. I do this for one complete round of the conjoint experiment for Kenya in detail. I discuss some steps more than others as if pertain to important outcomes for the final analysis of results.\n",
        "\n",
        "\n"
      ],
      "metadata": {
        "id": "OcDdtkkG_isS"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#subseting data for first round. The main outcome variable are 'r1_fchoice', 'r1_like_A', 'r1_like_B' (others are attached later).\n",
        "#r1_fchoice denotes which form was clicked during in the forced-choice. It is for this variable I need to extract a row for the un-selected form.\n",
        "\n",
        "df1 = df_k[['id', 'age', 'educ', 'gender', 'ethnicty', 'reg_vote', 'vot_2017',\n",
        "       'close_to_pty', 'party', 'trust_EMB', 'chf_j_kq', 'monitor_kq',\n",
        "       ' rec_mishaps', 'direct_voters', 'counts_recs_vot', 'protect_int_party', 'r1_fchoice', 'r1_like_A', 'r1_like_B']]"
      ],
      "metadata": {
        "id": "BBWB-M_eTJ-O"
      },
      "execution_count": 13,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "df1.head(2)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 233
        },
        "id": "95l41cuuXsCU",
        "outputId": "164423b9-0b15-4d50-8e45-23e31e2c0bd9"
      },
      "execution_count": 14,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   id  age                  educ gender ethnicty reg_vote vot_2017  \\\n",
              "0   0   24  University completed   Male      Luo      Yes      Yes   \n",
              "1   1   55  University completed   Male      Luo      Yes      Yes   \n",
              "\n",
              "  close_to_pty                             party              trust_EMB  \\\n",
              "0          Yes  Orange Democratic Movement (ODM)  I somewhat trust them   \n",
              "1          Yes  Orange Democratic Movement (ODM)    I do not trust them   \n",
              "\n",
              "       chf_j_kq                          monitor_kq         rec_mishaps  \\\n",
              "0  David Maraga  Elections Observation Group (ELOG)                IEBC   \n",
              "1  David Maraga  Elections Observation Group (ELOG)  Domestic Observers   \n",
              "\n",
              "  direct_voters counts_recs_vot  protect_int_party r1_fchoice  \\\n",
              "0          IEBC            IEBC  Political parties     Form B   \n",
              "1          IEBC            IEBC  Political parties     Form A   \n",
              "\n",
              "         r1_like_A      r1_like_B  \n",
              "0      Very likely  Very unlikely  \n",
              "1  Somewhat likely    Very likely  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-79bd03d8-2907-4b36-a34b-8f1ac79e3d18\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>age</th>\n",
              "      <th>educ</th>\n",
              "      <th>gender</th>\n",
              "      <th>ethnicty</th>\n",
              "      <th>reg_vote</th>\n",
              "      <th>vot_2017</th>\n",
              "      <th>close_to_pty</th>\n",
              "      <th>party</th>\n",
              "      <th>trust_EMB</th>\n",
              "      <th>chf_j_kq</th>\n",
              "      <th>monitor_kq</th>\n",
              "      <th>rec_mishaps</th>\n",
              "      <th>direct_voters</th>\n",
              "      <th>counts_recs_vot</th>\n",
              "      <th>protect_int_party</th>\n",
              "      <th>r1_fchoice</th>\n",
              "      <th>r1_like_A</th>\n",
              "      <th>r1_like_B</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>University completed</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Orange Democratic Movement (ODM)</td>\n",
              "      <td>I somewhat trust them</td>\n",
              "      <td>David Maraga</td>\n",
              "      <td>Elections Observation Group (ELOG)</td>\n",
              "      <td>IEBC</td>\n",
              "      <td>IEBC</td>\n",
              "      <td>IEBC</td>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form B</td>\n",
              "      <td>Very likely</td>\n",
              "      <td>Very unlikely</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>University completed</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Yes</td>\n",
              "      <td>Orange Democratic Movement (ODM)</td>\n",
              "      <td>I do not trust them</td>\n",
              "      <td>David Maraga</td>\n",
              "      <td>Elections Observation Group (ELOG)</td>\n",
              "      <td>Domestic Observers</td>\n",
              "      <td>IEBC</td>\n",
              "      <td>IEBC</td>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>Very likely</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-79bd03d8-2907-4b36-a34b-8f1ac79e3d18')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-79bd03d8-2907-4b36-a34b-8f1ac79e3d18 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-79bd03d8-2907-4b36-a34b-8f1ac79e3d18');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-bb18523d-8a4b-4374-be13-8e2c42763140\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-bb18523d-8a4b-4374-be13-8e2c42763140')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-bb18523d-8a4b-4374-be13-8e2c42763140 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 14
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "#melting the dataframe: I use the fact that 'r1_like_A', 'r1_like_B' are outcomes collected for both the selected and unselected forms to create duplicate rows\n",
        "df1 = pd.melt(df1, id_vars=['id', 'age', 'educ', 'gender', 'ethnicty', 'reg_vote', 'vot_2017',\n",
        "       'close_to_pty', 'party', 'trust_EMB', 'chf_j_kq', 'monitor_kq',\n",
        "       ' rec_mishaps', 'direct_voters', 'counts_recs_vot', 'protect_int_party', 'r1_fchoice'])"
      ],
      "metadata": {
        "id": "UgfAeK9S9zNc"
      },
      "execution_count": 15,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#preview of duplicated row for respondents with id = 0, 1, 2\n",
        "df1[(df1.index >= 0) & (df1.index <= 2) | (df1.index >= 250) & (df1.index <= 252)][['id','age','gender', 'ethnicty','r1_fchoice','variable', 'value']]"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 235
        },
        "id": "9wAKcW_gyCrJ",
        "outputId": "98741ef1-18d0-4a22-b6d7-e6842d3c7a7b"
      },
      "execution_count": 16,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "     id  age  gender ethnicty r1_fchoice   variable            value\n",
              "0     0   24    Male      Luo     Form B  r1_like_A      Very likely\n",
              "1     1   55    Male      Luo     Form A  r1_like_A  Somewhat likely\n",
              "2     2   38  Female   Kikuyu     Form B  r1_like_A  Somewhat likely\n",
              "250   0   24    Male      Luo     Form B  r1_like_B    Very unlikely\n",
              "251   1   55    Male      Luo     Form A  r1_like_B      Very likely\n",
              "252   2   38  Female   Kikuyu     Form B  r1_like_B  Somewhat likely"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-50374f3d-9b78-4527-834c-bc92a7f3d5b5\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>age</th>\n",
              "      <th>gender</th>\n",
              "      <th>ethnicty</th>\n",
              "      <th>r1_fchoice</th>\n",
              "      <th>variable</th>\n",
              "      <th>value</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Very likely</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2</td>\n",
              "      <td>38</td>\n",
              "      <td>Female</td>\n",
              "      <td>Kikuyu</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>250</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Very unlikely</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>251</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Very likely</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>252</th>\n",
              "      <td>2</td>\n",
              "      <td>38</td>\n",
              "      <td>Female</td>\n",
              "      <td>Kikuyu</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Somewhat likely</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-50374f3d-9b78-4527-834c-bc92a7f3d5b5')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-50374f3d-9b78-4527-834c-bc92a7f3d5b5 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-50374f3d-9b78-4527-834c-bc92a7f3d5b5');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-88d736ad-ddaf-4342-9cb7-c1d00b924e97\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-88d736ad-ddaf-4342-9cb7-c1d00b924e97')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-88d736ad-ddaf-4342-9cb7-c1d00b924e97 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 16
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "#checking that the forced choice outcome contains only strings of the type 'Form A' or 'Form B' because\n",
        "#these strings will be used in order to create the outcome variables.\n",
        "\n",
        "df1['r1_fchoice'].unique()"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "8WlZFMPOgqIG",
        "outputId": "7d3b9406-c542-4ce6-ebc1-91a909b01986"
      },
      "execution_count": 17,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array(['Form B', 'Form A'], dtype=object)"
            ]
          },
          "metadata": {},
          "execution_count": 17
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "###**Creating a Conjoint Forced-Choice Outcome**\n",
        "\n",
        "I add extra detail here because this step is delicate. As you can see in the preview above, melting the dataframe repeats the forced choice outcome, but the 'r1_like_A' and 'r1_like_B' remain distinct.\n",
        "\n",
        "The way I create a 'chosen' variable is simply to compare the last letter string in the *r1_fchoice* and the last letter string in the *variable* column. Here congruence indicates a 'chosen' form and disimilarity indicates the not chosen form.\n",
        "\n",
        "After creating this chosen variable, I create copy of the r1_fchoice variable (called 'form_proper') where I relabel the form according to its actual form name, using a similar logic of: if the last letter strings match keep the same text, otherwise replace the value of 'form proper' with the last letter string from *variable* (which has the true form name).\n",
        "\n",
        "This is best observed in the preview table following the next chunk of code:"
      ],
      "metadata": {
        "id": "kvXcNUE-eAPa"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#creation of binary forced choice outcome (whether a form of chosen or not)\n",
        "df1['chosen'] = df1.variable.str[-1:] == df1.r1_fchoice.str[-1:]\n",
        "df1.chosen = df1.chosen.astype(int)"
      ],
      "metadata": {
        "id": "3WyHclfEA9a5"
      },
      "execution_count": 18,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Preview:**"
      ],
      "metadata": {
        "id": "FPO0kyOci7p-"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df1[(df1.index >= 0) & (df1.index <= 2) | (df1.index >= 250) & (df1.index <= 252)][['id','age','gender', 'ethnicty','r1_fchoice','variable', 'value', 'chosen']]"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 235
        },
        "id": "xEsW1CqDaM-s",
        "outputId": "98bd5883-e7bf-4093-cf07-d52734a7907f"
      },
      "execution_count": 19,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "     id  age  gender ethnicty r1_fchoice   variable            value  chosen\n",
              "0     0   24    Male      Luo     Form B  r1_like_A      Very likely       0\n",
              "1     1   55    Male      Luo     Form A  r1_like_A  Somewhat likely       1\n",
              "2     2   38  Female   Kikuyu     Form B  r1_like_A  Somewhat likely       0\n",
              "250   0   24    Male      Luo     Form B  r1_like_B    Very unlikely       1\n",
              "251   1   55    Male      Luo     Form A  r1_like_B      Very likely       0\n",
              "252   2   38  Female   Kikuyu     Form B  r1_like_B  Somewhat likely       1"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-724d89c0-cf30-4081-b00e-dff82449086b\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>age</th>\n",
              "      <th>gender</th>\n",
              "      <th>ethnicty</th>\n",
              "      <th>r1_fchoice</th>\n",
              "      <th>variable</th>\n",
              "      <th>value</th>\n",
              "      <th>chosen</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Very likely</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2</td>\n",
              "      <td>38</td>\n",
              "      <td>Female</td>\n",
              "      <td>Kikuyu</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>250</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Very unlikely</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>251</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Very likely</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>252</th>\n",
              "      <td>2</td>\n",
              "      <td>38</td>\n",
              "      <td>Female</td>\n",
              "      <td>Kikuyu</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-724d89c0-cf30-4081-b00e-dff82449086b')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-724d89c0-cf30-4081-b00e-dff82449086b button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-724d89c0-cf30-4081-b00e-dff82449086b');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-5d8a4e25-7cd3-4155-8feb-9f16fbc427b7\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-5d8a4e25-7cd3-4155-8feb-9f16fbc427b7')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-5d8a4e25-7cd3-4155-8feb-9f16fbc427b7 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 19
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "#Here I create a variables that contains the proper label of the form associated with each row. This is a VERY IMPORTANT STEP!\n",
        "\n",
        "#create a duplicate column from the r1_fchoice\n",
        "df1['form_proper'] = df1['r1_fchoice']\n",
        "\n",
        "#using the same logic of congruence/incongrunce compare the last letter string of 'form_proper' to the last letter string in 'variable'\n",
        "#if they are different replace the value of 'form_proper' with the last letter string of 'variable'\n",
        "df1['form_proper'].mask(df1.variable.str[-1:] != df1.r1_fchoice.str[-1:], df1.variable.str[-1:], inplace=True)\n",
        "\n",
        "#clean_up the form_proper variable so that it only contains the labels A and B\n",
        "df1['form_proper'] = df1.form_proper.str[-1:]"
      ],
      "metadata": {
        "id": "W_npkHoNpvbC"
      },
      "execution_count": 20,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Preview**"
      ],
      "metadata": {
        "id": "LzIs4OzeqUoE"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df1[(df1.index >= 0) & (df1.index <= 2) | (df1.index >= 250) & (df1.index <= 252)][['id','age','gender', 'ethnicty','r1_fchoice','form_proper','variable', 'value', 'chosen']]"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 235
        },
        "id": "zuypZWJWqPJ-",
        "outputId": "e0dfb3e6-05e5-4ec3-f0ca-5d9a1a838c8e"
      },
      "execution_count": 21,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "     id  age  gender ethnicty r1_fchoice form_proper   variable  \\\n",
              "0     0   24    Male      Luo     Form B           A  r1_like_A   \n",
              "1     1   55    Male      Luo     Form A           A  r1_like_A   \n",
              "2     2   38  Female   Kikuyu     Form B           A  r1_like_A   \n",
              "250   0   24    Male      Luo     Form B           B  r1_like_B   \n",
              "251   1   55    Male      Luo     Form A           B  r1_like_B   \n",
              "252   2   38  Female   Kikuyu     Form B           B  r1_like_B   \n",
              "\n",
              "               value  chosen  \n",
              "0        Very likely       0  \n",
              "1    Somewhat likely       1  \n",
              "2    Somewhat likely       0  \n",
              "250    Very unlikely       1  \n",
              "251      Very likely       0  \n",
              "252  Somewhat likely       1  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-0009c767-1e13-49ee-ab65-2081eae1c6c2\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>age</th>\n",
              "      <th>gender</th>\n",
              "      <th>ethnicty</th>\n",
              "      <th>r1_fchoice</th>\n",
              "      <th>form_proper</th>\n",
              "      <th>variable</th>\n",
              "      <th>value</th>\n",
              "      <th>chosen</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form B</td>\n",
              "      <td>A</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Very likely</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form A</td>\n",
              "      <td>A</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2</td>\n",
              "      <td>38</td>\n",
              "      <td>Female</td>\n",
              "      <td>Kikuyu</td>\n",
              "      <td>Form B</td>\n",
              "      <td>A</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>250</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form B</td>\n",
              "      <td>B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Very unlikely</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>251</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>Form A</td>\n",
              "      <td>B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Very likely</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>252</th>\n",
              "      <td>2</td>\n",
              "      <td>38</td>\n",
              "      <td>Female</td>\n",
              "      <td>Kikuyu</td>\n",
              "      <td>Form B</td>\n",
              "      <td>B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-0009c767-1e13-49ee-ab65-2081eae1c6c2')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-0009c767-1e13-49ee-ab65-2081eae1c6c2 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-0009c767-1e13-49ee-ab65-2081eae1c6c2');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-7028da44-c722-4d3c-8aa0-3854a980e5d9\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-7028da44-c722-4d3c-8aa0-3854a980e5d9')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-7028da44-c722-4d3c-8aa0-3854a980e5d9 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 21
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "###**Attaching treatment**"
      ],
      "metadata": {
        "id": "Y0NJIYl4qj6H"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#Now that 'form_proper' provides the correct form label for each row, I use it to attach the treatment assignment.\n",
        "df1['treatment'] = df1.form_proper.str[-1:]\n",
        "\n",
        "#The columns 'i1' and 'i2' in the original dataset correspond to the treatments on Form A and Form B.\n",
        "#The code below attaches the treatment conditional on the letter value: 'i1' for Form A rows and 'i2' for Form B rows.\n",
        "df1[\"treatment\"] = np.where(df1[\"treatment\"] == \"A\", df_k.loc[df1.id].i1, df_k.loc[df1.id].i2)"
      ],
      "metadata": {
        "id": "QoaDsQUdpq1I"
      },
      "execution_count": 22,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#preview code\n",
        "df1[(df1.index >= 0) & (df1.index <= 2) | (df1.index >= 250) & (df1.index <= 252)][['id','age','gender', 'ethnicty','form_proper','variable', 'value', 'chosen', 'treatment']]"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 235
        },
        "id": "8KyFF3bKp-GK",
        "outputId": "6d065784-0d0a-4615-8e94-d651ea2bcafa"
      },
      "execution_count": 24,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "     id  age  gender ethnicty form_proper   variable            value  chosen  \\\n",
              "0     0   24    Male      Luo           A  r1_like_A      Very likely       0   \n",
              "1     1   55    Male      Luo           A  r1_like_A  Somewhat likely       1   \n",
              "2     2   38  Female   Kikuyu           A  r1_like_A  Somewhat likely       0   \n",
              "250   0   24    Male      Luo           B  r1_like_B    Very unlikely       1   \n",
              "251   1   55    Male      Luo           B  r1_like_B      Very likely       0   \n",
              "252   2   38  Female   Kikuyu           B  r1_like_B  Somewhat likely       1   \n",
              "\n",
              "               treatment  \n",
              "0    KY_Errors66_256.png  \n",
              "1    KY_Errors25_356.png  \n",
              "2    KY_Errors60_346.png  \n",
              "250  KY_Errors17_246.png  \n",
              "251  KY_Errors15_145.png  \n",
              "252   KY_Errors52_16.png  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-a56261a1-c73c-4c4c-bc10-ae0b702939b1\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>age</th>\n",
              "      <th>gender</th>\n",
              "      <th>ethnicty</th>\n",
              "      <th>form_proper</th>\n",
              "      <th>variable</th>\n",
              "      <th>value</th>\n",
              "      <th>chosen</th>\n",
              "      <th>treatment</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>A</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Very likely</td>\n",
              "      <td>0</td>\n",
              "      <td>KY_Errors66_256.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>A</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>1</td>\n",
              "      <td>KY_Errors25_356.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2</td>\n",
              "      <td>38</td>\n",
              "      <td>Female</td>\n",
              "      <td>Kikuyu</td>\n",
              "      <td>A</td>\n",
              "      <td>r1_like_A</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>0</td>\n",
              "      <td>KY_Errors60_346.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>250</th>\n",
              "      <td>0</td>\n",
              "      <td>24</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Very unlikely</td>\n",
              "      <td>1</td>\n",
              "      <td>KY_Errors17_246.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>251</th>\n",
              "      <td>1</td>\n",
              "      <td>55</td>\n",
              "      <td>Male</td>\n",
              "      <td>Luo</td>\n",
              "      <td>B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Very likely</td>\n",
              "      <td>0</td>\n",
              "      <td>KY_Errors15_145.png</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>252</th>\n",
              "      <td>2</td>\n",
              "      <td>38</td>\n",
              "      <td>Female</td>\n",
              "      <td>Kikuyu</td>\n",
              "      <td>B</td>\n",
              "      <td>r1_like_B</td>\n",
              "      <td>Somewhat likely</td>\n",
              "      <td>1</td>\n",
              "      <td>KY_Errors52_16.png</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-a56261a1-c73c-4c4c-bc10-ae0b702939b1')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-a56261a1-c73c-4c4c-bc10-ae0b702939b1 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-a56261a1-c73c-4c4c-bc10-ae0b702939b1');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-8c34f5c3-b57c-4d5a-980f-8d4bf2168ccf\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-8c34f5c3-b57c-4d5a-980f-8d4bf2168ccf')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-8c34f5c3-b57c-4d5a-980f-8d4bf2168ccf button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 24
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "#extract the label from the picture name in the 'treatment' variable (e.g. 'Errors66' from 'KY_Errors66_256.png'), so that I have a column for quickly identifying pictures\n",
        "df1['label'] = df1.treatment.str.split('_').str[-2]"
      ],
      "metadata": {
        "id": "9G1GXntPqjsK"
      },
      "execution_count": 25,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Treatment Codes**\n",
        "\n",
        "Next I want the treatment code, which is the last string of digits in the image name. For example, it is the '256' in 'KY_Errors66_256.png'. Each digit encodes one treatment that is present on the form image, so '256' means that treatments 2, 5 and 6 are present. In the key below, the Malawi version of each treatment is listed first and the Kenya version second:\n",
        "\n",
        "*   1 = Presiding Officer Present\n",
        "*   2 = DPP Present | Jubilee Present (ruling parties, circa 2020)\n",
        "*   3 = MCP Present | NASA Present (opposition parties, circa 2020)\n",
        "*   4 = NICE Present | ELOG Present (independent observers)\n",
        "*   5 = DPP Crossed Out | Jubilee Crossed Out (ruling party error)\n",
        "*   6 = MCP Crossed Out | NASA Crossed Out (opposition error)\n",
        "\n"
      ],
      "metadata": {
        "id": "NgCnvuET46S0"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#keep the last part of the file name (e.g. '256.png'), then pull out the digits, because these digits encode the treatment\n",
        "df1['treatment'] = df1.treatment.str.split('_').str[-1]\n",
        "df1['treatment'] = df1.treatment.str.extract(r'(\\d+)', expand=False)"
      ],
      "metadata": {
        "id": "JquY4h-k5D1k"
      },
      "execution_count": 27,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#turn the digit string into a list of integers, e.g. '256' becomes [2, 5, 6]\n",
        "df1['treatment'] = df1['treatment'].apply(lambda x: [int(i) for i in x])"
      ],
      "metadata": {
        "id": "ugRRovGK5hh6"
      },
      "execution_count": 28,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#use the treatment-list column to create one boolean indicator column per treatment\n",
        "df1['presiding'] = df1['treatment'].apply(lambda x: 1 in x)\n",
        "df1['party1'] = df1['treatment'].apply(lambda x: 2 in x)\n",
        "df1['party2'] = df1['treatment'].apply(lambda x: 3 in x)\n",
        "df1['observer'] = df1['treatment'].apply(lambda x: 4 in x)\n",
        "df1['party1_cros'] = df1['treatment'].apply(lambda x: 5 in x)\n",
        "df1['party2_cros'] = df1['treatment'].apply(lambda x: 6 in x)"
      ],
      "metadata": {
        "id": "J9UZGdNBJyZd"
      },
      "execution_count": 29,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#turn the boolean columns into integer (0/1) columns\n",
        "treatment_cols = ['presiding', 'party1', 'party2', 'observer', 'party1_cros', 'party2_cros']\n",
        "df1[treatment_cols] = df1[treatment_cols].astype(int)\n"
      ],
      "metadata": {
        "id": "glffeIAsMgL0"
      },
      "execution_count": 30,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#dictionary for recoding the five-point likelihood scale as integers\n",
        "likelihood_dct = {\n",
        "    'Very unlikely': 1,\n",
        "    'Somewhat unlikely': 2,\n",
        "    'Neither likely nor unlikely': 3,\n",
        "    'Somewhat likely': 4,\n",
        "    'Very likely': 5}"
      ],
      "metadata": {
        "id": "L5LYsJMgOtnb"
      },
      "execution_count": 31,
      "outputs": []
    },
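    {
      "cell_type": "code",
      "source": [
        "#recode the likelihood responses as integers (1-5) using likelihood_dct\n",
        "#(assigning the result back avoids calling replace(inplace=True) on a column selection, which newer pandas versions warn about)\n",
        "df1['value'] = df1['value'].replace(likelihood_dct)\n",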
        "\n",
        "#cosmetic changes before export\n",
        "df1.rename(columns={'value': 'lik_fraud'}, inplace=True)\n",
        "df1.variable = df1.variable.str[:2]"
      ],
      "metadata": {
        "id": "CYWuwXyzRplq"
      },
      "execution_count": 32,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Preview:**"
      ],
      "metadata": {
        "id": "m_-9VaU21QDO"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df1.iloc[0:3, 15:30]"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 162
        },
        "id": "mf-HLdpn_PkR",
        "outputId": "6e56d936-9ee1-40d0-8904-95a8e450c6e7"
      },
      "execution_count": 33,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   protect_int_party r1_fchoice variable  lik_fraud  chosen form_proper  \\\n",
              "0  Political parties     Form B       r1          5       0           A   \n",
              "1  Political parties     Form A       r1          4       1           A   \n",
              "2  Political parties     Form B       r1          4       0           A   \n",
              "\n",
              "   treatment     label  presiding  party1  party2  observer  party1_cros  \\\n",
              "0  [2, 5, 6]  Errors66          0       1       0         0            1   \n",
              "1  [3, 5, 6]  Errors25          0       0       1         0            1   \n",
              "2  [3, 4, 6]  Errors60          0       0       1         1            0   \n",
              "\n",
              "   party2_cros  \n",
              "0            1  \n",
              "1            1  \n",
              "2            1  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-cdd00747-090b-4d6e-8dc2-e0888d445bd9\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>protect_int_party</th>\n",
              "      <th>r1_fchoice</th>\n",
              "      <th>variable</th>\n",
              "      <th>lik_fraud</th>\n",
              "      <th>chosen</th>\n",
              "      <th>form_proper</th>\n",
              "      <th>treatment</th>\n",
              "      <th>label</th>\n",
              "      <th>presiding</th>\n",
              "      <th>party1</th>\n",
              "      <th>party2</th>\n",
              "      <th>observer</th>\n",
              "      <th>party1_cros</th>\n",
              "      <th>party2_cros</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1</td>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>A</td>\n",
              "      <td>[2, 5, 6]</td>\n",
              "      <td>Errors66</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r1</td>\n",
              "      <td>4</td>\n",
              "      <td>1</td>\n",
              "      <td>A</td>\n",
              "      <td>[3, 5, 6]</td>\n",
              "      <td>Errors25</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1</td>\n",
              "      <td>4</td>\n",
              "      <td>0</td>\n",
              "      <td>A</td>\n",
              "      <td>[3, 4, 6]</td>\n",
              "      <td>Errors60</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 33
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "#create placeholder columns (copies of form_proper) for the other outcomes\n",
        "df1[\"tally_chg\"] = df1.form_proper\n",
        "df1[\"unprt_int\"] = df1.form_proper\n",
        "df1[\"p_excess\"] = df1.form_proper\n",
        "df1[\"po_prs\"] = df1.form_proper\n",
        "\n",
        "#attach the other outcomes from the original data: take the Form A response where form_proper is 'A', otherwise the Form B response\n",
        "df1[\"tally_chg\"] = np.where(df1[\"tally_chg\"] == \"A\", df_k.loc[df1.id].r1A_tally_chg, df_k.loc[df1.id].r1B_tally_chg)\n",
        "df1[\"unprt_int\"] = np.where(df1[\"unprt_int\"] == \"A\", df_k.loc[df1.id].r1A_unprt_int, df_k.loc[df1.id].r1B_unprt_int)\n",
        "df1[\"p_excess\"] = np.where(df1[\"p_excess\"] == \"A\", df_k.loc[df1.id].r1A_p_excess, df_k.loc[df1.id].r1B_p_excess)\n",
        "df1[\"po_prs\"] = np.where(df1[\"po_prs\"] == \"A\", df_k.loc[df1.id].r1A_po_prs, df_k.loc[df1.id].r1B_po_prs)\n",
        "\n",
        "#The final step is to use the dictionary below to recode tally_chg, unprt_int, p_excess and po_prs as numeric values\n",
        "#(the keys match the response labels exactly, including the capital 'A' in 'Strongly Agree')\n",
        "factors_dct = {\n",
        "    'Strongly disagree': 1,\n",
        "    'Somewhat disagree': 2,\n",
        "    'Neither agree nor disagree': 3,\n",
        "    'Somewhat agree': 4,\n",
        "    'Strongly Agree': 5}"
      ],
      "metadata": {
        "id": "cC5uhfyKQ_MY"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "df1.iloc[0:3, 20:35]"
      ],
      "metadata": {
        "id": "KY7Ou5JYP-Zu",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 193
        },
        "outputId": "240b0f27-b750-42d5-8350-d29d9e672d04"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  form_proper  treatment     label  presiding  party1  party2  observer  \\\n",
              "0           A  [2, 5, 6]  Errors66          0       1       0         0   \n",
              "1           A  [3, 5, 6]  Errors25          0       0       1         0   \n",
              "2           A  [3, 4, 6]  Errors60          0       0       1         1   \n",
              "\n",
              "   party1_cros  party2_cros          tally_chg          unprt_int  \\\n",
              "0            1            1  Strongly disagree  Strongly disagree   \n",
              "1            1            1     Somewhat agree     Strongly Agree   \n",
              "2            0            1     Somewhat agree     Somewhat agree   \n",
              "\n",
              "            p_excess             po_prs  \n",
              "0  Strongly disagree  Somewhat disagree  \n",
              "1     Strongly Agree     Strongly Agree  \n",
              "2     Somewhat agree  Somewhat disagree  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-2acd1e0f-cc8b-4332-b719-2d581528b87e\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>form_proper</th>\n",
              "      <th>treatment</th>\n",
              "      <th>label</th>\n",
              "      <th>presiding</th>\n",
              "      <th>party1</th>\n",
              "      <th>party2</th>\n",
              "      <th>observer</th>\n",
              "      <th>party1_cros</th>\n",
              "      <th>party2_cros</th>\n",
              "      <th>tally_chg</th>\n",
              "      <th>unprt_int</th>\n",
              "      <th>p_excess</th>\n",
              "      <th>po_prs</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>A</td>\n",
              "      <td>[2, 5, 6]</td>\n",
              "      <td>Errors66</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>Strongly disagree</td>\n",
              "      <td>Strongly disagree</td>\n",
              "      <td>Strongly disagree</td>\n",
              "      <td>Somewhat disagree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>A</td>\n",
              "      <td>[3, 5, 6]</td>\n",
              "      <td>Errors25</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>Somewhat agree</td>\n",
              "      <td>Strongly Agree</td>\n",
              "      <td>Strongly Agree</td>\n",
              "      <td>Strongly Agree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>A</td>\n",
              "      <td>[3, 4, 6]</td>\n",
              "      <td>Errors60</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>Somewhat agree</td>\n",
              "      <td>Somewhat agree</td>\n",
              "      <td>Somewhat agree</td>\n",
              "      <td>Somewhat disagree</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2acd1e0f-cc8b-4332-b719-2d581528b87e')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-2acd1e0f-cc8b-4332-b719-2d581528b87e button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-2acd1e0f-cc8b-4332-b719-2d581528b87e');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-d15fca61-f718-4b22-9d0d-4c793a76dd93\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-d15fca61-f718-4b22-9d0d-4c793a76dd93')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-d15fca61-f718-4b22-9d0d-4c793a76dd93 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 57
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "#**Function**\n",
        "\n",
        "Now that I have gone over the core steps involved in cleaning one round of the conjoint experiment, I combine all the above steps in the below function that produces a list of three dataframes (one for each round of the experiment) for a given Qualtrics dataframe and country code (either 'ky' or 'mw')."
      ],
      "metadata": {
        "id": "KRXFsxBg_m-H"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def conjclean(df_input, country):\n",
        "\n",
        "  #creation of empty list to store dataframes for each round of the conjoint experiment\n",
        "  df_list = []\n",
        "\n",
        "  #creation of the dictionary converting people's beliefs about the likelihood of fraud into numbers\n",
        "  likelihood_dct = {\n",
        "    'Very unlikely': 1,\n",
        "    'Somewhat unlikely':2,\n",
        "    'Neither likely nor unlikely': 3,\n",
        "    'Somewhat likely': 4,\n",
        "    'Very likely': 5}\n",
        "\n",
        "  #creation of dicitionary for converting the variables on reasons for suspected fraud into numbers.\n",
        "  factors_dct = {\n",
        "    'Strongly disagree': 1,\n",
        "    'Somewhat disagree':2,\n",
        "    'Neither agree nor disagree': 3,\n",
        "    'Somewhat agree': 4,\n",
        "    'Strongly Agree': 5}\n",
        "\n",
        "  #dicitonary storing the correct answer for question on observer groups in Kenya and Malawi\n",
        "  kn_quest = {'mw': ['Malawi Electoral Support Network (MESN)', 'National Initiative for Civic Education (NICE)'],\n",
        "            'ky': ['Elections Observation Group (ELOG)']}\n",
        "  ########################################################################################\n",
        "\n",
        "  for i in [1,2,3]:\n",
        "    in_1 = 'r' + str(i) + '_fchoice'\n",
        "    in_2 = 'r' + str(i) + '_like_A'\n",
        "    in_3 = 'r' + str(i) + '_like_B'\n",
        "\n",
        "    temp = df_input[['id', 'age', 'educ', 'gender', 'ethnicty', 'reg_vote', 'vot_2017','close_to_pty', 'party', 'trust_EMB', 'chf_j_kq', 'monitor_kq',\n",
        "                  ' rec_mishaps', 'direct_voters', 'counts_recs_vot', 'protect_int_party', 'r' + str(i) + '_fchoice', 'r' + str(i) + '_like_A', 'r' + str(i) + '_like_B']]\n",
        "\n",
        "    temp = pd.melt(temp, id_vars=['id', 'age', 'educ', 'gender', 'ethnicty', 'reg_vote', 'vot_2017', 'close_to_pty', 'party', 'trust_EMB', 'chf_j_kq',\n",
        "                                   'monitor_kq',' rec_mishaps', 'direct_voters', 'counts_recs_vot', 'protect_int_party', 'r' + str(i) + '_fchoice'])\n",
        "\n",
        "    #create the forced choice outcome\n",
        "    temp['chosen'] = temp.variable.str[-1:] == temp[in_1].str[-1:]\n",
        "    temp.chosen = temp.chosen.astype(int)\n",
        "\n",
        "    #creation of correctly labelled form name\n",
        "    temp['form_proper'] = temp[in_1]\n",
        "    temp['form_proper'].mask(temp.variable.str[-1:] != temp[in_1].str[-1:], temp.variable.str[-1:], inplace=True)\n",
        "\n",
        "    #Clean up variable by extracting only the last letter string\n",
        "    temp['form_proper'] = temp['form_proper'].str[-1:]\n",
        "\n",
        "    #Now that we have 'form_proper' as a variable that provides the correct form labelling, I use it in attaching the treatment assignment.\n",
        "\n",
        "    temp['treatment'] = temp['form_proper']\n",
        "\n",
        "    if i == 1:\n",
        "        temp[\"treatment\"] = np.where(temp[\"treatment\"] == \"A\", df_input.loc[temp.id].i1, df_input.loc[temp.id].i2)\n",
        "\n",
        "    if i == 2:\n",
        "        temp[\"treatment\"] = np.where(temp[\"treatment\"] == \"A\", df_input.loc[temp.id].i3, df_input.loc[temp.id].i4)\n",
        "\n",
        "    if i == 3:\n",
        "        temp[\"treatment\"] = np.where(temp[\"treatment\"] == \"A\", df_input.loc[temp.id].i5, df_input.loc[temp.id].i6)\n",
        "\n",
        "    temp['label'] = ''\n",
        "    temp['label'] = temp.treatment.str.split('_').str[-2]\n",
        "\n",
        "    temp['treatment'] = temp.treatment.str.split('_').str[-1]\n",
        "    temp['treatment'] = temp.treatment.str.extract('(\\d+)')\n",
        "\n",
        "    temp['treatment'] = temp['treatment'].apply(lambda x: [int(i) for i in x])\n",
        "\n",
        "    #create 0,1 treatment columns\n",
        "    temp['presiding'] = temp['treatment'].apply(lambda x: 1 in x)\n",
        "    temp['party1'] = temp['treatment'].apply(lambda x: 2 in x)\n",
        "    temp['party2'] = temp['treatment'].apply(lambda x: 3 in x)\n",
        "    temp['observer'] = temp['treatment'].apply(lambda x: 4 in x)\n",
        "    temp['party1_cros'] = temp['treatment'].apply(lambda x: 5 in x)\n",
        "    temp['party2_cros'] = temp['treatment'].apply(lambda x: 6 in x)\n",
        "\n",
        "    #turn column type/class into numeric\n",
        "    temp[['presiding', 'party1', 'party2','observer','party1_cros','party2_cros']] = temp[['presiding', 'party1', 'party2','observer','party1_cros','party2_cros']].astype(int)\n",
        "\n",
        "    temp[\"tally_chg\"] = temp['form_proper']\n",
        "    temp[\"unprt_int\"] = temp['form_proper']\n",
        "    temp[\"p_excess\"] = temp['form_proper']\n",
        "    temp[\"po_prs\"] = temp['form_proper']\n",
        "\n",
        "    #attach other outcomes from original data\n",
        "    temp[\"tally_chg\"] = np.where(temp[\"tally_chg\"] == \"A\", df_input.loc[temp.id][f'r{i}A_tally_chg'], df_input.loc[temp.id][f'r{i}B_tally_chg'])\n",
        "    temp[\"unprt_int\"] = np.where(temp[\"unprt_int\"] == \"A\", df_input.loc[temp.id][f'r{i}A_unprt_int'], df_input.loc[temp.id][f'r{i}B_unprt_int'])\n",
        "    temp[\"p_excess\"] = np.where(temp[\"p_excess\"] == \"A\", df_input.loc[temp.id][f'r{i}A_p_excess'], df_input.loc[temp.id][f'r{i}B_p_excess'])\n",
        "    temp[\"po_prs\"] = np.where(temp[\"po_prs\"] == \"A\", df_input.loc[temp.id][f'r{i}A_po_prs'], df_input.loc[temp.id][f'r{i}B_po_prs'])\n",
        "\n",
        "    #turn likelihood into factor variable\n",
        "    temp['value'].replace(likelihood_dct, inplace=True)\n",
        "\n",
        "    #using the repeated'r1_like_' string in this column, extract the first to characters to create a variable indicating the round/iteration of the conjoint\n",
        "    temp.variable = temp.variable.str[:2]\n",
        "\n",
        "    #rename the 'value' and 'variable' column to reflect their new information/prupose.\n",
        "    temp.rename(columns={'value': 'lik_fraud'}, inplace=True)\n",
        "    temp.rename(columns={'variable': 'run'}, inplace=True)\n",
        "\n",
        "    #rename the forced choice variable\n",
        "    temp.rename(columns={f'r{i}_fchoice':'fchoice_og'}, inplace=True)\n",
        "\n",
        "    #creating a variable the detects whether respondents correctly identified an observer group, conditional on country\n",
        "    temp['obs_kq'] = temp['monitor_kq'].apply(lambda x: any(x in kn_quest[country] for x in x.split(',') ))\n",
        "    temp.obs_kq = temp.obs_kq.astype(int)\n",
        "\n",
        "    #converting the responses about people's reasons for suspecting fraud into numbers using the earlier created dictionary\n",
        "    temp['tally_chg2'] = temp['tally_chg'].replace(factors_dct)\n",
        "    temp['unprt_int2'] = temp['unprt_int'].replace(factors_dct)\n",
        "    temp['p_excess2'] = temp['p_excess'].replace(factors_dct)\n",
        "    temp['po_prs2'] = temp['po_prs'].replace(factors_dct)\n",
        "\n",
        "    #creation of other variables for analysis\n",
        "    temp['pic_num'] = temp.label.str.extract('(\\d+)')              #a variable for the picture number of the form\n",
        "    temp['pic_num'] = temp['pic_num'].values.astype(int)           #ensure that this is a numeric variable\n",
        "\n",
        "    temp['err_typ'] = temp['label'].str.replace(\"(\\d+)\", \"\", case=False, regex=True) #a variable for whether the form is clean/marked\n",
        "    temp.err_typ = temp.err_typ.str.lower()                                          #turn label to lower case\n",
        "\n",
        "    df_list.append(temp)\n",
        "\n",
        "  return df_list"
      ],
      "metadata": {
        "id": "XVOz_9KiIC0P"
      },
      "execution_count": 34,
      "outputs": []
    },
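    {
      "cell_type": "markdown",
      "source": [
        "The reshaping and forced-choice logic inside `conjclean` can be sketched on two made-up respondents for round 1 only. This is a minimal illustration, not the full function: the toy values and the name `long_df` are assumptions and do not come from the survey data.\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# hypothetical round-1 responses for two respondents (not real survey data)\n",
        "toy = pd.DataFrame({\n",
        "    'id': [0, 1],\n",
        "    'r1_fchoice': ['Form A', 'Form B'],\n",
        "    'r1_like_A': ['Very unlikely', 'Somewhat likely'],\n",
        "    'r1_like_B': ['Very likely', 'Very unlikely'],\n",
        "})\n",
        "\n",
        "# wide to long: one row per respondent per form\n",
        "long_df = pd.melt(toy, id_vars=['id', 'r1_fchoice'])\n",
        "\n",
        "# the last character of 'variable' is the form letter for that row\n",
        "long_df['form_proper'] = long_df['variable'].str[-1:]\n",
        "\n",
        "# chosen = 1 when the row's form is the one the respondent picked\n",
        "long_df['chosen'] = (long_df['form_proper'] == long_df['r1_fchoice'].str[-1:]).astype(int)\n",
        "\n",
        "print(long_df)\n",
        "```\n",
        "\n",
        "Each respondent contributes one row per form, and `chosen` equals 1 only on the row for the form they picked."
      ],
      "metadata": {}
    },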
    {
      "cell_type": "markdown",
      "source": [
        "#**Run Function on Country Data**"
      ],
      "metadata": {
        "id": "fmUEuB1w2V_X"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#running the function on the Kenya and Malawi dataframes\n",
        "mal_lst = conjclean(df_m,'mw')\n",
        "ken_lst = conjclean(df_k,'ky')"
      ],
      "metadata": {
        "id": "WM1pOkVV2cLB"
      },
      "execution_count": 35,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#collapsing the list of dataframes (from each experiment round) into single dataframes for each country\n",
        "df_mal = pd.concat(mal_lst , ignore_index=True)\n",
        "df_ken = pd.concat(ken_lst , ignore_index=True)"
      ],
      "metadata": {
        "id": "SzTT_bzX2qYa"
      },
      "execution_count": 36,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "df_mal['country'] = 'malawi'\n",
        "df_ken['country'] = 'kenya'"
      ],
      "metadata": {
        "id": "Anhd9SyKGQt_"
      },
      "execution_count": 37,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "#**Combination code**"
      ],
      "metadata": {
        "id": "R227iap4FoEq"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#order the observations so that ids appear in sequential groups\n",
        "df_mal.sort_values(by=['id'], inplace=True, )\n",
        "\n",
        "#fixing index back to consecutive sequence after reordering by id\n",
        "df_mal.reset_index(drop = True, inplace=True)"
      ],
      "metadata": {
        "id": "FRsHNhW7EY4i"
      },
      "execution_count": 38,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#attaching a suffix to the the Kenyan ids so that they are unique (to avoid duplication of id tags when combined with Malawi data)\n",
        "df_ken.id = df_ken['id'].apply(lambda x: str(x)+'_k')\n",
        "\n",
        "#use the same re-ordering code\n",
        "df_ken.sort_values(by=['id'], inplace=True, )\n",
        "df_ken.reset_index(drop = True, inplace=True)"
      ],
      "metadata": {
        "id": "5ehviuj-FPOc"
      },
      "execution_count": 39,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#combine Kenya and Malawi data\n",
        "full_df = pd.concat([df_mal, df_ken], ignore_index=False)"
      ],
      "metadata": {
        "id": "wOuNuiYjRhNU"
      },
      "execution_count": 40,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "#**Other Variable Creation**\n",
        "\n",
        "In this section I use the variables 'pic_num' to create a variable noting whether the vote difference in the form image was large or small. Clean images have a total of 4*4 possible treatment combinations."
      ],
      "metadata": {
        "id": "eUWeMB8XHhND"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Note of Error**\n",
        "\n",
        "I used '32' instead of '16' in condition below. This is a problem for only the clean forms: I have a total of 32 images for the clean forms and those with a picture number ('pic_num') greater than 16 are the images that show a large vote difference. Since I wrote 32 instead of 16 in the condition, it means that my condition does not filter out the forms with a large vote difference. I have placed both the incorrect and the correct code below. The graphs impacted by this error two appendix items: (1) the graph on differences in outcomes by vote difference and (2) the graph on differences in outcome for winners vs. losers - given that I build the winner-loser variable (in RStudio) using the'vote_df' variable below."
      ],
      "metadata": {
        "id": "e_mQgQAZXa1d"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Incorrect Version**"
      ],
      "metadata": {
        "id": "Lyo57r1Ba3Xh"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "#create a boolean mask/conditional filter for the vote difference variable\n",
        "cond = [full_df.err_typ.eq('errors') & (full_df.pic_num > 48), full_df.err_typ.eq('clean') & (full_df.pic_num > 32)] #original (incorrect) mask code for appendix item\n",
        "\n",
        "#outcome list for the case where conditions = TRUE\n",
        "choices = ['large', 'large']\n",
        "\n",
        "#use mask to create a new column\n",
        "full_df['vote_df'] = np.select(cond, choices, default='small')"
      ],
      "metadata": {
        "id": "I11AeTGuHg9j"
      },
      "execution_count": 41,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Correct Version of the Code**\n",
        "\n",
        "I have just created a new column with the corrected version of the mask/filter for re-evaluation in the Rstudio analysis:"
      ],
      "metadata": {
        "id": "lOl61urcXfRT"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "cond2 = [full_df.err_typ.eq('errors') & (full_df.pic_num > 48), full_df.err_typ.eq('clean') & (full_df.pic_num > 16)] #correct masking for appendix item\n",
        "\n",
        "#outcome list for the case where conditions = TRUE\n",
        "choices = ['large', 'large']\n",
        "\n",
        "#apply mask to\n",
        "full_df['vote_df2'] = np.select(cond2, choices, default='small')"
      ],
      "metadata": {
        "id": "MDzxnQ9iVBpV"
      },
      "execution_count": 42,
      "outputs": []
    },
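    {
      "cell_type": "markdown",
      "source": [
        "To see what the correction changes, the sketch below applies both clean-form cut-offs to a few made-up rows. The toy values and the helper `vote_diff` are hypothetical and are not part of the analysis pipeline.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "\n",
        "# hypothetical rows (not real survey data)\n",
        "toy = pd.DataFrame({\n",
        "    'err_typ': ['clean', 'clean', 'clean', 'errors', 'errors'],\n",
        "    'pic_num': [10, 20, 32, 40, 60],\n",
        "})\n",
        "\n",
        "def vote_diff(df, clean_cutoff):\n",
        "    # same two-condition mask as above, with the clean-form cut-off as a parameter\n",
        "    cond = [df.err_typ.eq('errors') & (df.pic_num > 48),\n",
        "            df.err_typ.eq('clean') & (df.pic_num > clean_cutoff)]\n",
        "    return np.select(cond, ['large', 'large'], default='small')\n",
        "\n",
        "toy['vote_df'] = vote_diff(toy, 32)   # original (incorrect) cut-off\n",
        "toy['vote_df2'] = vote_diff(toy, 16)  # corrected cut-off\n",
        "\n",
        "print(toy)\n",
        "```\n",
        "\n",
        "In this toy example the two columns disagree only for the clean rows with a picture number above 16; the rows for forms with errors are labelled identically."
      ],
      "metadata": {}
    },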
    {
      "cell_type": "markdown",
      "source": [
        "**Creation of a second variable that is just a descriptive string of the number of party agents on the form**"
      ],
      "metadata": {
        "id": "pLnzVXzCbCKb"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "cond3 = [(full_df.party1 == 0) & (full_df.party2 == 0), (full_df.party1 == 1) & (full_df.party2 == 1)]\n",
        "choices_party = ['no agent', 'two agents']\n",
        "full_df['agt_num'] = np.select(cond3, choices_party, default='one agent')"
      ],
      "metadata": {
        "id": "ZlqMVtbzzDjD"
      },
      "execution_count": 43,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "#**Data Preview**"
      ],
      "metadata": {
        "id": "mRFU2Cir91d1"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "full_df.iloc[0:10, 15:30]"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 566
        },
        "id": "oVCVbCSF-f2c",
        "outputId": "59ff0b0b-7d84-4c99-8d70-b07e41efb3e9"
      },
      "execution_count": 44,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   protect_int_party fchoice_og run  lik_fraud  chosen form_proper  \\\n",
              "0  Political parties     Form B  r1          5       0           A   \n",
              "1  Political parties     Form A  r2          1       1           A   \n",
              "2  Political parties     Form A  r3          5       0           B   \n",
              "3  Political parties     Form A  r3          4       1           A   \n",
              "4  Political parties     Form A  r2          5       0           B   \n",
              "5  Political parties     Form B  r1          2       1           B   \n",
              "6                MEC     Form A  r1          5       1           A   \n",
              "7                MEC     Form A  r2          2       1           A   \n",
              "8                MEC     Form B  r3          1       1           B   \n",
              "9                MEC     Form A  r1          1       0           B   \n",
              "\n",
              "         treatment     label  presiding  party1  party2  observer  \\\n",
              "0        [1, 4, 6]  Errors27          1       0       0         1   \n",
              "1  [1, 2, 3, 4, 5]  Errors94          1       1       1         1   \n",
              "2           [3, 6]   Errors7          0       0       1         0   \n",
              "3     [1, 4, 5, 6]  Errors39          1       0       0         1   \n",
              "4           [4, 6]  Errors11          0       0       0         1   \n",
              "5     [1, 2, 3, 5]  Errors34          1       1       1         0   \n",
              "6     [1, 2, 4, 6]  Errors76          1       1       0         1   \n",
              "7        [1, 4, 6]  Errors27          1       0       0         1   \n",
              "8              [0]    Clean1          0       0       0         0   \n",
              "9        [1, 4, 6]  Errors75          1       0       0         1   \n",
              "\n",
              "   party1_cros  party2_cros                   tally_chg  \n",
              "0            0            1              Strongly Agree  \n",
              "1            1            0           Strongly disagree  \n",
              "2            0            1              Somewhat agree  \n",
              "3            1            1              Somewhat agree  \n",
              "4            0            1              Strongly Agree  \n",
              "5            1            0  Neither agree nor disagree  \n",
              "6            0            1              Strongly Agree  \n",
              "7            0            1              Somewhat agree  \n",
              "8            0            0           Strongly disagree  \n",
              "9            0            1           Strongly disagree  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-0b9a3748-6303-4c06-84b0-d90f16c6f81c\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>protect_int_party</th>\n",
              "      <th>fchoice_og</th>\n",
              "      <th>run</th>\n",
              "      <th>lik_fraud</th>\n",
              "      <th>chosen</th>\n",
              "      <th>form_proper</th>\n",
              "      <th>treatment</th>\n",
              "      <th>label</th>\n",
              "      <th>presiding</th>\n",
              "      <th>party1</th>\n",
              "      <th>party2</th>\n",
              "      <th>observer</th>\n",
              "      <th>party1_cros</th>\n",
              "      <th>party2_cros</th>\n",
              "      <th>tally_chg</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1</td>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>A</td>\n",
              "      <td>[1, 4, 6]</td>\n",
              "      <td>Errors27</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>Strongly Agree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r2</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>A</td>\n",
              "      <td>[1, 2, 3, 4, 5]</td>\n",
              "      <td>Errors94</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>Strongly disagree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r3</td>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>B</td>\n",
              "      <td>[3, 6]</td>\n",
              "      <td>Errors7</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>Somewhat agree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r3</td>\n",
              "      <td>4</td>\n",
              "      <td>1</td>\n",
              "      <td>A</td>\n",
              "      <td>[1, 4, 5, 6]</td>\n",
              "      <td>Errors39</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>Somewhat agree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r2</td>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>B</td>\n",
              "      <td>[4, 6]</td>\n",
              "      <td>Errors11</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>Strongly Agree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>Political parties</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r1</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>B</td>\n",
              "      <td>[1, 2, 3, 5]</td>\n",
              "      <td>Errors34</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>Neither agree nor disagree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>MEC</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r1</td>\n",
              "      <td>5</td>\n",
              "      <td>1</td>\n",
              "      <td>A</td>\n",
              "      <td>[1, 2, 4, 6]</td>\n",
              "      <td>Errors76</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>Strongly Agree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>MEC</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r2</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>A</td>\n",
              "      <td>[1, 4, 6]</td>\n",
              "      <td>Errors27</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>Somewhat agree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>MEC</td>\n",
              "      <td>Form B</td>\n",
              "      <td>r3</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>B</td>\n",
              "      <td>[0]</td>\n",
              "      <td>Clean1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Strongly disagree</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9</th>\n",
              "      <td>MEC</td>\n",
              "      <td>Form A</td>\n",
              "      <td>r1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>B</td>\n",
              "      <td>[1, 4, 6]</td>\n",
              "      <td>Errors75</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>Strongly disagree</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-0b9a3748-6303-4c06-84b0-d90f16c6f81c')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-0b9a3748-6303-4c06-84b0-d90f16c6f81c button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-0b9a3748-6303-4c06-84b0-d90f16c6f81c');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-f6dd8ccd-e7fd-4a2f-9a4e-9d54b6b3613d\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-f6dd8ccd-e7fd-4a2f-9a4e-9d54b6b3613d')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-f6dd8ccd-e7fd-4a2f-9a4e-9d54b6b3613d button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 44
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "**Export Final Dataframe**"
      ],
      "metadata": {
        "id": "PZrBGqRgSABd"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "full_df.to_csv(r'/filepath/rep_full_df.csv')"
      ],
      "metadata": {
        "id": "H7jPZdk6R_l5"
      },
      "execution_count": 45,
      "outputs": []
    }
  ]
}